Canonical, stable, general mapping using context schemes

Authors

  • Adam M. Novak
  • Yohei Rosen
  • David Haussler
  • Benedict Paten
Abstract

MOTIVATION: Sequence mapping is the cornerstone of modern genomics. However, most existing sequence mapping algorithms are insufficiently general.

RESULTS: We introduce context schemes: a method that allows the unambiguous recognition of a reference base in a query sequence by testing the query for substrings from an algorithmically defined set. Context schemes only map when there is a unique best mapping, and define this criterion uniformly for all reference bases. Mappings under context schemes can also be made stable, so that extension of the query string (e.g. by increasing read length) will not alter the mapping of previously mapped positions. Context schemes are general in several senses. They natively support the detection of arbitrarily complex, novel rearrangements relative to the reference. They can scale over orders of magnitude in query sequence length. Finally, they are trivially extensible to more complex reference structures, such as graphs, that incorporate additional variation. We demonstrate empirically the existence of high-performance context schemes, and present efficient context scheme mapping algorithms.

AVAILABILITY AND IMPLEMENTATION: The software test framework created for this study is available from https://registry.hub.docker.com/u/adamnovak/sequence-graphs/.

CONTACT: [email protected]

SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
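To make the idea of recognizing a reference base by testing the query for substrings more concrete, here is a deliberately simplified Python sketch. It is not the authors' algorithm or software: it assumes a toy scheme in which the context set of a reference base is every fixed-length k-mer that covers it and occurs exactly once in the reference. The names `unique_kmer_index` and `map_query`, and the fixed length `k`, are hypothetical choices made only for this illustration. A query base is mapped only when every matching context agrees on a single reference position; otherwise it is left unmapped.

```python
from collections import defaultdict
from typing import Dict, List, Optional


def unique_kmer_index(reference: str, k: int) -> Dict[str, int]:
    """Return {k-mer: start} for k-mers occurring exactly once in the reference."""
    starts = defaultdict(list)
    for i in range(len(reference) - k + 1):
        starts[reference[i:i + k]].append(i)
    return {kmer: pos[0] for kmer, pos in starts.items() if len(pos) == 1}


def map_query(reference: str, query: str, k: int) -> List[Optional[int]]:
    """Map each query base to a reference position, or None if not unique.

    Toy assumption: a "context" is any reference-unique k-mer covering the base.
    """
    index = unique_kmer_index(reference, k)
    candidates = [set() for _ in query]
    for j in range(len(query) - k + 1):
        ref_start = index.get(query[j:j + k])
        if ref_start is None:
            continue  # this query k-mer is absent or repeated in the reference
        for offset in range(k):
            # Every base covered by the k-mer inherits an implied position.
            candidates[j + offset].add(ref_start + offset)
    # Map only when all matching contexts agree on one reference position.
    return [next(iter(c)) if len(c) == 1 else None for c in candidates]


if __name__ == "__main__":
    ref = "ACGTTGCA"
    print(map_query(ref, "GTTGC", 3))   # every base has one consistent position
    print(map_query(ref, "ACGCA", 3))   # the middle base has conflicting contexts
```

Note that this toy scheme is not stable in the sense described above: extending the query from "ACG" to "ACGCA" introduces a conflicting context for the third base and unmaps it. Avoiding that kind of change is the additional property the abstract claims for suitably designed context schemes.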

Similar resources

Acceptable Programs Revisited

Acceptable logic programs have been studied extensively in the context of proving termination of Prolog programs. It is difficult, however, to establish acceptability from the definition since this depends on finding a suitable model, which need not be a Herbrand model in general, together with a suitable level mapping that one can use to check the conditions which characterize acceptability. In thi...

Stable Varieties with a Twist

1.1. Moduli of stable varieties: the case of surfaces. In the paper [KSB88], Kollár and Shepherd-Barron introduced stable surfaces as a generalization of stable curves. This class is natural from the point of view of the minimal model program, which shows that any one-parameter family of surfaces of general type admits a unique stable limit. Indeed, the stable reduction process of Deligne and M...

Stability of 2nd Hilbert points of canonical curves

Such an interpretation could then be used to study properties of the rational contraction $\overline{M}_g \dashrightarrow \overline{M}_g(\alpha)$ and to obtain structural results about the cone of effective divisors of $\overline{M}_g$, in particular its Mori chamber decomposition. Hassett and Hyeon constructed the first two log canonical models of $\overline{M}_g$ by considering GIT quotients of asymptotically linearized Hilbert schemes of tricanonical and bicano...

A Conservative Mesh-Free Scheme and Generalized Framework for Conservation Laws

We present a novel mesh-free scheme for solving partial differential equations. We first derive a conservative and stable formulation of mesh-free first derivatives. We then show that this formulation is a special case of a general conservative mesh-free framework that allows flexible choices of flux schemes. Necessary conditions and algorithms for calculating the coefficients for our mesh-free...

Journal title:
  • Bioinformatics

Volume 31, Issue 22

Pages -

Publication date: 2015